Fast Parallel Sorting under LogP: From Theory to Practice
Abstract
1.1 ABSTRACT

The LogP model characterizes the performance of modern parallel machines with a small set of parameters: the communication latency (L), overhead (o), bandwidth (g), and the number of processors (P). In this paper, we analyze four parallel sorting algorithms (bitonic, column, radix, and sample sort) under LogP. We develop implementations of these algorithms in a parallel extension to C and compare their actual performance on a CM-5 with 32 to 512 processors against the performance predicted by LogP using parameter values for this machine. In our experience, the model served as a valuable guide throughout the development of the fast parallel sorts and revealed subtle defects in the implementations. The final observed performance closely matches the prediction across a broad range of problem and machine sizes.

1.2 INTRODUCTION

Fast sorting is important in a wide variety of practical applications, is interesting to study from a theoretical viewpoint, and offers a wealth of novel parallel solutions. The richness of this particular problem arises, in part, because it fundamentally requires communication as well as computation. Thus, sorting is an excellent area in which to investigate the translation of novel parallel algorithms from theory to practice on large parallel systems. In current (1993) technology, "fast parallel sorting" corresponds to a practical performance target of "sorting a billion large keys on a thousand processors

© 1992 John Wiley & Sons Ltd
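As a rough illustration of how the four LogP parameters named in the abstract turn into time estimates, the sketch below computes the cost of one small message and of a stream of back-to-back messages between two processors. It is not taken from the paper: the class and function names are hypothetical, the parameter values are made up for the example (they are not measured CM-5 values), and g is treated as the minimum gap between consecutive messages at a processor, which is the usual LogP reading of the bandwidth parameter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogP:
    """Hypothetical container for the four LogP parameters (all times in microseconds)."""
    L: float  # latency of one small message through the network
    o: float  # processor overhead to send or to receive one message
    g: float  # gap: minimum interval between consecutive messages at one processor
    P: int    # number of processors


def one_message_time(m: LogP) -> float:
    """Send overhead, then network latency, then receive overhead."""
    return m.o + m.L + m.o


def pipelined_send_time(m: LogP, n: int) -> float:
    """Time until the last of n back-to-back small messages has been received.

    Assumes one sender, one receiver, and that the sender issues a new
    message every max(g, o) time units.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    interval = max(m.g, m.o)
    return (n - 1) * interval + one_message_time(m)


# Illustrative values only, chosen for easy arithmetic.
example = LogP(L=6.0, o=2.0, g=4.0, P=64)
single = one_message_time(example)          # 2 + 6 + 2
stream = pipelined_send_time(example, 5)    # 4 gaps of 4, plus one full message
```

With these assumed values a single message costs 10 time units and five pipelined messages cost 26, which shows why communication-heavy sorts are sensitive to g once many messages are in flight.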
Similar resources
Fast Parallel Sorting Under LogP: Experience with the CM-5
In this paper, the LogP model is used to analyze four parallel sorting algorithms (bitonic, column, radix, and sample sort). LogP characterizes the performance of modern parallel machines with a small set of parameters: the communication latency (L), overhead (o), bandwidth (g), and the number of processors (P). We develop implementations of these algorithms in Split-C, a parallel extension to...
Models and Resource Metrics for Parallel and Distributed Computation
This paper presents a framework of using resource metrics to characterize the various models of parallel computation. Our framework reflects the approach of recent models to abstract architectural details into several generic parameters, which we call resource metrics. We examine the different resource metrics chosen by different parallel models, categorizing the models into four classes: the b...
Models and Resource Metrics for Parallel and Distributed Computation
This paper presents a framework of using resource metrics to characterize the various models of parallel computation. Our framework reflects the approach of recent models to abstract architectural details into several generic parameters, which we call resource metrics. We examine the different resource metrics chosen by different parallel models, categorizing the models into four classes: the basi...
The collective computing model
The parallel computing model presented in this paper, the Collective Computing model (CCM), is an improvement of the well-known Bulk Synchronous Parallel (BSP) model. The synchronicity imposed by the BSP model restricts the set of available algorithms and prevents the overlapping of computation and communication. Other models, like the LogP model, allow asynchronous computing and overlapping bu...
Modeling Parallel Sorts with LogP on the CM-5
In this paper, the LogP model is used to analyze four parallel sorting algorithms (bitonic, column, radix, and sample sort). LogP characterizes the performance of modern parallel machines with a small set of parameters: the communication latency (L), overhead (o), bandwidth (g), and the number of processors (P). We develop implementations of these algorithms in Split-C, a parallel extension to...